coprocessors with a basic N-body simulation

Author

  • Andrey Vladimirov
Abstract

Intel® Xeon Phi™ coprocessors can deliver higher performance and better energy efficiency than Intel® Xeon® processors for certain parallel applications. In this paper, we investigate the porting and optimization of a test problem for the Intel Xeon Phi coprocessor. The test problem is a basic N-body simulation, which is the foundation of a number of applications in computational astrophysics and biophysics. Using common C code for the host processor and for the coprocessor, we benchmark the N-body simulation. The simulation runs 2.3x to 5.4x faster on a single Intel Xeon Phi coprocessor than on two Intel Xeon E5 series processors. The performance depends on the accuracy settings for transcendental arithmetic. We also study the assembly code produced by the compiler from the C code. This allows us to pinpoint some strategies for designing C/C++ programs that result in efficient, automatically vectorized applications for Intel Xeon family devices.


Similar resources

Design, Implementation, and Experimental Evaluation of Coprocessor Architectures for Fast Qualitative Simulation

This dissertation presents the design, prototype implementation, and experimental evaluation of coprocessor architectures for fast qualitative simulation. The main goal of this work is to improve the runtime of the qualitative simulator QSIM. In qualitative simulation, physical systems are modeled on a higher level of abstraction than in other simulation paradigms, such as continuous simulation. ...


Control software for reconfigurable coprocessors

On-line data processing at the ATLAS general purpose particle detector, which is currently under construction at Geneva, generates demands on computing power that are difficult to satisfy with commodity CPU-based computers. One of the most demanding applications is the recognition of particle tracks that originate from B-quark decays. However, this and many other applications can benefit from p...


An Analysis of an Interrupt-Driven Implementation of the Master-Worker Model with Application-Specific Coprocessors

In this thesis, we present a versatile parallel programming model composed of an individual general-purpose processor aided by several application-specific coprocessors. These computing units operate under a simplification of the master-worker model. The user-defined coprocessors may be either homogeneous or heterogeneous. We analyze system performance with regard to system size and task granul...


A Technology of 3D Elastic Wave Propagation Simulation Using Hybrid Supercomputers

We present a technology of 3D seismic field simulation for high-performance computing systems with GPUs or Intel Xeon Phi coprocessors. This technology covers adaptation of a mathematical modeling method and development of a parallel algorithm. We describe the parallel implementation designed for simulation using staggered grids and the 3D domain decomposition method. We study the parallel alg...


Towards simulation of subcellular calcium dynamics at nanometre resolution

Numerical simulation of subcellular Ca2+ dynamics with a resolution down to one nanometre can be an important tool for discovering the physiological cause of many heart diseases. The requirement of enormous computational power, however, has made such simulations prohibitive so far. By using up to 12,288 Intel Xeon Phi 31S1P coprocessors on the new hybrid cluster Tianhe-2, which is the new numbe...




Publication date: 2013